You are viewing the RapidMiner Studio documentation for version 10.0.

Read Document (Text Processing)

Input

  • file (File)

    The file port.

Output

  • output

    The output port.

Parameters

  • file

    Name of the file to read the data from.

  • extract_text_only

    If checked, structural information such as XML or HTML tags will be ignored and discarded.

  • use_file_extension_as_type

    If checked, the type of each file will be determined by its extension. Unknown extensions will be treated as text files.

  • content_type

    The content type of the input texts.

  • encoding

    The encoding used for reading or writing files.
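
To make the interaction between these parameters concrete, here is a minimal Python sketch of the behaviour they describe. It is not RapidMiner code and does not use any RapidMiner API: the function `read_document`, the `EXTENSION_TYPES` mapping, the type names `"txt"`, `"html"` and `"xml"`, the default values, and the regex-based tag stripping are all assumptions made for illustration. Only the parameter names and the rule that unknown extensions are treated as text come from the descriptions above.

```python
import re
from html import unescape
from pathlib import Path

# Hypothetical extension-to-type mapping; the operator's real list is not documented here.
EXTENSION_TYPES = {".txt": "txt", ".html": "html", ".htm": "html", ".xml": "xml"}

# Naive tag matcher, good enough for a sketch; a real parser would be more robust.
TAG_PATTERN = re.compile(r"<[^>]+>")


def read_document(file, extract_text_only=True, use_file_extension_as_type=True,
                  content_type="txt", encoding="utf-8"):
    """Read one file and return its text, mimicking the parameters described above."""
    path = Path(file)

    if use_file_extension_as_type:
        # Unknown extensions fall back to plain text, as the parameter description states.
        content_type = EXTENSION_TYPES.get(path.suffix.lower(), "txt")

    # The encoding parameter controls how bytes are decoded into text.
    text = path.read_text(encoding=encoding)

    if extract_text_only and content_type in ("html", "xml"):
        # Discard structural markup, decode entities, and collapse leftover whitespace.
        text = unescape(TAG_PATTERN.sub(" ", text))
        text = " ".join(text.split())

    return text
```

In this sketch, `content_type` only matters when `use_file_extension_as_type` is off; otherwise the extension decides. Whether the real operator resolves that precedence the same way is an assumption.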